@spiridonov
Contributor

@spiridonov spiridonov commented Oct 8, 2025

What this PR does / why we need it:

I found inconsistent behavior in the records returned from eval. For a mathematical expression, it returns a new column and the caller must release it. For predicates (in filter.go), it returns a new column if the predicate is non-trivial. If a predicate is just a ColumnRef (for a boolean column), it returns that column as-is, and releasing it causes a panic (too many releases). So I added an explicit Retain() on all nodes of an expression. Now the behavior is consistent: the caller must call Release() on all records from eval() without worrying about whether the result was a column reference or an expression.
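To illustrate the ownership rule described above, here is a minimal sketch using a toy reference-counted type. The names `column`, `evalColumnRef`, and `evalExpr` are hypothetical and are not taken from the PR; the sketch only assumes that a column reference hands back an existing column while an expression builds a new one.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a reference-counted Arrow array.
type column struct {
	name string
	refs int
}

// newColumn creates a column with a single reference owned by the creator.
func newColumn(name string) *column { return &column{name: name, refs: 1} }

// Retain adds one reference.
func (c *column) Retain() { c.refs++ }

// Release drops one reference and panics on over-release.
func (c *column) Release() {
	if c.refs == 0 {
		panic("too many releases: " + c.name)
	}
	c.refs--
}

// evalColumnRef returns an existing input column. Retaining it first means
// the caller owns a reference, exactly as it would for a freshly built column.
func evalColumnRef(input *column) *column {
	input.Retain()
	return input
}

// evalExpr builds a new column; the caller owns its single reference.
func evalExpr(name string) *column { return newColumn(name) }

func main() {
	input := newColumn("is_error") // reference owned by the input record
	results := []*column{evalColumnRef(input), evalExpr("a+b")}
	for _, r := range results {
		r.Release() // the same rule applies to both kinds of result
	}
	fmt.Println("input refs after caller release:", input.refs)
	input.Release() // the input record drops its own reference
}
```

Without the `Retain()` in `evalColumnRef`, the caller's `Release()` would consume the input record's reference, and the record's own later `Release()` would be the one that panics.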

I started by adding memory.CheckedAllocator to all tests and that's where I found leaks in the first place.
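The idea behind a checked allocator can be sketched with a toy wrapper that tracks outstanding bytes. This is a simplified, hypothetical stand-in (`countingAllocator`, `run`, and `leaked` are illustrative names), not the Arrow implementation; it only shows why a forgotten release becomes visible once allocations are counted.

```go
package main

import "fmt"

// countingAllocator is a hypothetical, simplified allocator that records
// how many bytes are currently allocated and not yet freed.
type countingAllocator struct{ outstanding int }

// Allocate hands out n bytes and counts them as outstanding.
func (a *countingAllocator) Allocate(n int) []byte {
	a.outstanding += n
	return make([]byte, n)
}

// Free returns a buffer and subtracts its size from the outstanding total.
func (a *countingAllocator) Free(b []byte) { a.outstanding -= len(b) }

// leaked reports bytes that were allocated but never freed.
func (a *countingAllocator) leaked() int { return a.outstanding }

// run simulates code under test; skipFree simulates a forgotten Release.
func run(a *countingAllocator, skipFree bool) {
	buf := a.Allocate(64)
	if skipFree {
		return
	}
	a.Free(buf)
}

func main() {
	ok := &countingAllocator{}
	run(ok, false)
	fmt.Println("balanced run, leaked bytes:", ok.leaked())

	bad := &countingAllocator{}
	run(bad, true)
	fmt.Println("leaky run, leaked bytes:", bad.leaked())
}
```

A test would assert that the leaked count is zero once the code under test has finished.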

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@spiridonov spiridonov requested a review from a team as a code owner October 8, 2025 14:05
@spiridonov spiridonov enabled auto-merge (squash) October 8, 2025 14:05
@trevorwhitney trevorwhitney disabled auto-merge October 8, 2025 20:00
Collaborator

@trevorwhitney trevorwhitney left a comment

LGTM. I disabled auto-merge as I have a few small nits, the biggest of which is that I'd like to move the defer x.Release() closer to the creation of x, more open/close style, where possible.

Comment on lines 162 to 168
defer func() {
	for _, a := range arrays {
		a.Release()
	}
}()

return array.NewRecord(schema, arrays, ct)
Collaborator

I prefer the defer Release approach when it's in an open/close situation, so for example immediately after creating an array. So, how would you feel about either a) moving this up into the builders loop above, so

for i, builder := range builders {
	arrays[i] = builder.NewArray()
	defer arrays[i].Release()
}

or b) storing the new record in a variable and then releasing synchronously (ie. not in a defer)?
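For reference, a `defer` inside a loop does not run at the end of each iteration. Every deferred call is queued and runs when the enclosing function returns, in last-in, first-out order. The sketch below shows this with a toy `resource` type; `resource` and `build` are hypothetical names, not code from the PR.

```go
package main

import "fmt"

// resource is a hypothetical releasable value that logs its own release.
type resource struct {
	id  int
	log *[]string
}

// Release records that this resource was released.
func (r *resource) Release() {
	*r.log = append(*r.log, fmt.Sprintf("release %d", r.id))
}

// build creates n resources and defers each Release right after creation.
// The deferred calls run when build returns, in last-in, first-out order.
func build(n int, log *[]string) {
	items := make([]*resource, n)
	for i := range items {
		items[i] = &resource{id: i, log: log}
		// The receiver items[i] is evaluated now; the call runs at function exit.
		defer items[i].Release()
	}
	*log = append(*log, "use items")
}

func main() {
	var log []string
	build(3, &log)
	for _, entry := range log {
		fmt.Println(entry)
	}
}
```

Because the releases happen only after the function body finishes, the values stay valid while a return expression that uses them is evaluated. The trade-off is that all of them are held until the function exits.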

Contributor Author

@spiridonov spiridonov Oct 8, 2025

That was an existing pattern that I followed. I am not a fan of releasing without defer because there is a risk of missing important cleanup on early returns or failures. Using defer is as simple and reliable as finally in Java.
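The early-return risk can be shown with a small sketch. The `buffer` type and the two `process` functions are hypothetical and exist only to contrast a synchronous release with a deferred one.

```go
package main

import (
	"errors"
	"fmt"
)

// buffer is a hypothetical resource that must be released exactly once.
type buffer struct{ released bool }

// Release marks the buffer as released.
func (b *buffer) Release() { b.released = true }

var errInvalid = errors.New("invalid input")

// processManual releases synchronously; the early return skips the release.
func processManual(b *buffer, valid bool) error {
	if !valid {
		return errInvalid // b is never released on this path
	}
	b.Release()
	return nil
}

// processDeferred registers the release up front, so every return path runs it.
func processDeferred(b *buffer, valid bool) error {
	defer b.Release()
	if !valid {
		return errInvalid
	}
	return nil
}

func main() {
	manual, deferred := &buffer{}, &buffer{}
	_ = processManual(manual, false)
	_ = processDeferred(deferred, false)
	fmt.Println("manual released on error path:", manual.released)
	fmt.Println("deferred released on error path:", deferred.released)
}
```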

return failureState(err)
}
fields = append(fields, arrow.Field{Name: columnNames[i], Type: vec.Type().ArrowType(), Metadata: types.ColumnMetadata(vec.ColumnType(), vec.Type())})
projected = append(projected, vec.ToArray())
Collaborator

can we do the defer release here, so it's closer to its creation, i.e. open/close style?

return nil, fmt.Errorf("unsupported datatype for partitioning %s", vec.Type())
}

arrays = append(arrays, vec.ToArray().(*array.String))
Collaborator

same comment as above, can we do the release here?

return nil, fmt.Errorf("unsupported datatype for grouping %s", vec.Type())
}

arrays = append(arrays, vec.ToArray().(*array.String))
Collaborator

ditto


t.Run("filter with true literal predicate", func(t *testing.T) {
- alloc := memory.DefaultAllocator
+ alloc := memory.NewCheckedAllocator(memory.NewGoAllocator())
Collaborator

is there a reason we can't use memory.DefaultAllocator instead of instantiating a new GoAllocator()?

Contributor Author

The test files are inconsistent about this, and GoAllocator just happened to be in the file I copy-pasted from. I fixed all test files to use DefaultAllocator.

@spiridonov spiridonov merged commit 16dab82 into main Oct 9, 2025
63 checks passed
@spiridonov spiridonov deleted the spiridonov-eval-arrow-memory branch October 9, 2025 15:43